Tuning QoD in stream processing engines
Authors
Abstract
Quality of Service (QoS) and Quality of Data (QoD) are the two major dimensions for evaluating any query processing system. In the context of data stream management systems (DSMSs), multi-query scheduling has been exploited to improve QoS. In this paper, we propose to exploit query scheduling to improve QoD in DSMSs. Specifically, we present a new policy for scheduling multiple continuous queries with the objective of maximizing the freshness of the output data streams and hence the QoD of those outputs. The proposed Freshness-Aware Scheduling of Multiple Continuous Queries (FAS-MCQ) policy decides the execution order of continuous queries based on each query's properties (i.e., cost and selectivity) as well as the properties of the input update streams (i.e., variability of updates). Our experimental results show that FAS-MCQ can improve QoD by up to 50% compared to existing scheduling policies used in DSMSs. Finally, we propose and evaluate a parametrized version of the FAS-MCQ scheduler that balances the trade-off between freshness and response time according to the application's requirements.
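The abstract does not state the priority function that FAS-MCQ uses, so the following is only a minimal sketch of the general idea of ordering continuous queries by their cost and selectivity, with a tunable parameter that trades freshness against response time. The names `Query`, `freshness_score`, `response_score`, `priority`, `schedule`, and the parameter `beta` are all hypothetical, the scoring formulas are illustrative assumptions rather than the paper's policy, and the variability of the input update streams is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    cost: float         # assumed: processing time per pending update
    selectivity: float  # assumed: fraction of updates that change the output (0..1)
    pending: int        # number of unprocessed updates waiting for this query


def freshness_score(q: Query) -> float:
    # Hypothetical heuristic: expected output changes applied per unit of work.
    return q.selectivity / q.cost


def response_score(q: Query) -> float:
    # Shortest-job-first style heuristic: cheaper queries score higher.
    return 1.0 / q.cost


def priority(q: Query, beta: float) -> float:
    # beta = 1.0 favors freshness only, beta = 0.0 favors response time only.
    return beta * freshness_score(q) + (1.0 - beta) * response_score(q)


def schedule(queries: list[Query], beta: float = 1.0) -> list[Query]:
    # Only queries with pending updates are runnable; highest priority first.
    runnable = [q for q in queries if q.pending > 0]
    return sorted(runnable, key=lambda q: priority(q, beta), reverse=True)


queries = [
    Query("A", cost=2.0, selectivity=0.9, pending=1),
    Query("B", cost=1.0, selectivity=0.2, pending=3),
    Query("C", cost=1.0, selectivity=0.5, pending=0),  # nothing to process
]

freshness_first = [q.name for q in schedule(queries, beta=1.0)]
response_first = [q.name for q in schedule(queries, beta=0.0)]
```

With these made-up numbers, the freshness-oriented setting runs the expensive but highly selective query A before B, while the response-time setting runs the cheap query B first. Query C is skipped in both cases because it has no pending updates.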
Similar resources
Scheduling Multiple Continuous Queries to Improve QoD
Quality of Service (QoS) and Quality of Data (QoD) are the two major dimensions for evaluating any query processing system. In the context of the new data stream management systems (DSMSs), multi-query scheduling has been exploited to improve QoS. In this paper, we are proposing to exploit scheduling to improve QoD. Specifically, we are presenting a new policy for scheduling multiple continuou...
PRSP: A Plugin-based Framework for RDF Stream Processing
In this paper, we propose a plugin-based framework for RDF stream processing (PRSP). With this framework, we can apply SPARQL engines to process C-SPARQL queries in a simple way while maintaining the high performance of those engines. Taking advantage of PRSP, we can process large RDF streams in a distributed context via distributed SPARQL engines. Moreover, we can evaluate the performance and c...
A Framework for Feeding Linked Data to Complex Event Processing Engines
A huge volume of Linked Data has been published on the Web, yet is not processable by Complex Event Processing (CEP) or Event Stream Processing (ESP) engines. This paper presents a framework to bridge this gap, under which Linked Data are first translated into events conforming to a lightweight ontology, and then fed to CEP engines. The event processing results will also be published back onto ...
Adaptively Approximate Techniques in Distributed Architectures
The wealth of information generated by users interacting with the network and its applications is often underutilized due to complications in accessing heterogeneous and dynamic data and in retrieving relevant information from sources having possibly unknown formats and structures. Processing complex requests on such information sources is, thus, costly, though not guaranteeing user satisfactio...
Loadstar: Load Shedding in Data Stream Mining
In this demo, we show that intelligent load shedding is essential in achieving optimum results in mining data streams under various resource constraints. The Loadstar system introduces load shedding techniques to classifying multiple data streams of large volume and high speed. Loadstar uses a novel metric known as the quality of decision (QoD) to measure the level of uncertainty in classificat...